Regularized Weighted Linear Regression for High-dimensional Censored Data
Abstract
Survival analysis aims to model time-to-event data, which arise in many biomedical and healthcare applications. A critical challenge in modeling such data is the presence of censored outcomes, which standard regression models cannot handle. In this paper, we propose a regularized linear regression model fitted by weighted least squares to handle survival prediction in the presence of censored instances. We also employ the elastic net penalty to induce sparsity in the linear model so that it handles high-dimensional data effectively. Unlike existing censored linear models, parameter estimation in our model does not require any prior estimation of the survival times of censored instances. In addition, we propose a self-training framework that improves the prediction performance of the proposed linear model. We evaluate the proposed model on several real-world high-dimensional biomedical benchmark datasets; the experimental results indicate that it outperforms related competing methods and attains very competitive performance across the datasets.
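The abstract does not state how the least-squares weights are computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes Kaplan-Meier (Stute) weights, under which censored instances receive weight zero and no censored survival time has to be imputed, and it assumes the response is the log of the observed time. The function names `km_weights` and `weighted_elastic_net`, the parameters `lam` and `alpha`, and the coordinate-descent solver are all hypothetical choices made for illustration. The self-training framework is not covered.

```python
import numpy as np

def km_weights(time, event):
    """Kaplan-Meier (Stute) weights; an assumed weighting scheme.

    time  : observed times (event or censoring time)
    event : 1 if the event was observed, 0 if censored
    Censored rows always receive weight 0.
    """
    n = len(time)
    # Sort by time; at tied times, events come before censored rows.
    order = np.lexsort((1 - event, time))
    d = event[order].astype(float)
    ranks = np.arange(1, n + 1)
    # factor_j = ((n - j) / (n - j + 1)) ** d_j
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    # Exclusive cumulative product: product over j < i.
    before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    w_sorted = d / (n - ranks + 1.0) * before
    w = np.empty(n)
    w[order] = w_sorted  # map weights back to the original row order
    return w

def weighted_elastic_net(X, y, w, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for
        0.5 * sum_i w_i (y_i - b0 - x_i . b)^2
        + lam * (alpha * ||b||_1 + 0.5 * (1 - alpha) * ||b||_2^2).
    Returns (intercept, coefficients).
    """
    total = w.sum()
    x_mean = w @ X / total          # weighted column means
    y_mean = w @ y / total          # weighted response mean
    Xc, yc = X - x_mean, y - y_mean # weighted centering removes the intercept
    z = w @ Xc**2                   # weighted squared norm of each column
    beta = np.zeros(X.shape[1])
    resid = yc.copy()
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            denom = z[j] + lam * (1 - alpha)
            if denom == 0.0:
                continue            # constant column with a pure L1 penalty
            rho = w @ (Xc[:, j] * resid) + z[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam * alpha, 0.0) / denom
            resid -= Xc[:, j] * (new - beta[j])
            beta[j] = new
    return y_mean - x_mean @ beta, beta

if __name__ == "__main__":
    # Synthetic data, for illustration only.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 20))
    true_log_t = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)
    cens_log_t = rng.normal(loc=1.0, scale=2.0, size=100)
    event = (true_log_t <= cens_log_t).astype(int)
    log_time = np.minimum(true_log_t, cens_log_t)
    w = km_weights(log_time, event)
    b0, b = weighted_elastic_net(X, log_time, w, lam=0.01, alpha=0.9)
    print("nonzero coefficients:", np.flatnonzero(b))
```

Because the weights sum to at most one, `lam` here sits on a different scale than in an unweighted fit and would need to be tuned, for example by cross-validation.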
Similar resources
Estimation in the l1-Regularized Accelerated Failure Time Model
This note considers variable selection in the semiparametric linear regression model for censored data. Semiparametric linear regression for censored data is a natural extension of the linear model for uncensored data; however, random censoring introduces substantial theoretical and numerical challenges. By now, a number of authors have made significant contributions for estimation and inference in the s...
Regularized Parametric Regression for High-dimensional Survival Analysis
Survival analysis aims to predict the occurrence of specific events of interest at future time points. The presence of incomplete observations due to censoring brings unique challenges in this domain and differentiates survival analysis techniques from other standard regression methods. In many applications where the distribution of the survival times can be explicitly modeled, parametric survi...
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival...
Censored linear model in high dimensions
Censored data are quite common in statistics and have been studied in depth in recent years (for some early references, see Powell (1984), Murphy et al. (1999), Chay and Powell (2001)). In this paper we consider censored high-dimensional data. High-dimensional models are in some ways more complex than their low-dimensional versions, so different techniques are required. For the linea...
Random rotation survival forest for high dimensional censored data
Recently, rotation forest has been extended to regression and survival analysis problems. However, due to the intensive computation incurred by principal component analysis, rotation forest often fails when confronted with high-dimensional or big data. In this study, we extend rotation forest to high-dimensional censored time-to-event data analysis by combining random subspace, bagging and rotation fo...